# Librispeech fine-tuning

Wav2vec2 Conformer Rope Large 100h Ft

A Wav2Vec2 Conformer model fine-tuned on 100 hours of Librispeech data. It uses rotary position embeddings.

Speech Recognition

Transformers English

Data2vec Audio Large 100h

Data2Vec is a general self-supervised learning framework applicable to speech, vision, and language tasks. This large audio model was pre-trained and fine-tuned on 100 hours of Librispeech audio data.

Speech Recognition

Transformers English

Data2vec Audio Large 10m

Data2Vec is a general self-supervised learning framework applicable to speech, vision, and language tasks. This large audio model was pre-trained and fine-tuned on 10 minutes of Librispeech data. It expects speech audio sampled at 16 kHz.
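
Because the model expects 16 kHz input, audio recorded at another rate has to be resampled before transcription. The following is a minimal sketch of that step, not the model authors' preprocessing. The function names `resample_linear` and `transcribe` are hypothetical, the Hugging Face model ID `facebook/data2vec-audio-large-10m` is an assumption not stated on this page, and linear interpolation is used only for brevity (it applies no anti-aliasing filter, so a dedicated resampler is preferable in practice).

```python
def resample_linear(samples, src_rate, dst_rate=16000):
    """Resample a mono signal to dst_rate by linear interpolation.

    Simplification: no low-pass filtering is applied before downsampling.
    """
    if src_rate <= 0 or dst_rate <= 0:
        raise ValueError("sample rates must be positive")
    if src_rate == dst_rate or len(samples) < 2:
        return list(samples)

    n_out = int(len(samples) * dst_rate / src_rate)
    step = src_rate / dst_rate  # source samples advanced per output sample
    last = len(samples) - 1
    out = []
    for i in range(n_out):
        pos = i * step
        lo = min(int(pos), last)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append(samples[lo] * (1.0 - frac) + samples[hi] * frac)
    return out


def transcribe(samples, src_rate, model_id="facebook/data2vec-audio-large-10m"):
    """Transcribe mono audio with a Transformers ASR pipeline.

    model_id is an assumed Hugging Face identifier; substitute the one you use.
    """
    # Local imports keep resample_linear usable without these packages.
    import numpy as np
    from transformers import pipeline

    audio = np.asarray(resample_linear(samples, src_rate), dtype=np.float32)
    asr = pipeline("automatic-speech-recognition", model=model_id)
    return asr({"raw": audio, "sampling_rate": 16000})["text"]
```

Calling `transcribe(samples, 44100)` would resample a 44.1 kHz recording and return the recognized text. The first call downloads the model weights, so it needs network access along with the `transformers` and `numpy` packages.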

Speech Recognition

Transformers English

Data2vec Audio Base 100h

Data2Vec is a general self-supervised learning framework applicable to speech, vision, and language tasks. This base audio model was pre-trained and fine-tuned on 100 hours of Librispeech audio data.

Speech Recognition

Transformers English
